Philosophical Transactions of the Royal Society B
● The Royal Society
Preprints posted in the last 7 days, ranked by how well they match Philosophical Transactions of the Royal Society B's content profile, based on 51 papers previously published here. The average preprint has a 0.20% match score for this journal, so anything above that is already an above-average fit.
Undurraga Lucero, J. A.; Chesnaye, M.; Simpson, D.; Laugesen, S.
Objective detection of evoked potentials (EPs) is central to digital diagnostics in hearing assessment and clinical neurophysiology, yet current approaches remain time-intensive and sensitive to inter-individual noise variability. Many existing detection methods rely on population-based assumptions or computationally demanding procedures, limiting robustness and efficiency in real-world clinical settings. We present Fmpi, a digital EP detection framework enabling individualised, real-time response detection through analytical modelling of the spectral colour and temporal dynamics of background noise within each recording. Using extensive simulations and large-scale human electroencephalography datasets spanning brainstem, steady-state, and cortical EPs recorded in adults and infants, we demonstrate performance comparable or superior to state-of-the-art bootstrapped methods while operating at a fraction of the computational cost and maintaining well-controlled sensitivity with improved specificity. Importantly, Fmpi incorporates a futility detection mechanism enabling early termination of uninformative recordings, reducing testing time without compromising diagnostic reliability.
Strand, P. S.; Trang, J. C.
Female genital cutting (FGC) is identified within global health and human rights discourse as aligned with gender inequality and female disempowerment. The persistence of FGC in high-prevalence societies is assumed to reflect women's limited influence over decisions concerning their daughters. Yet anthropological research has questioned whether this interpretation adequately reflects how FGC is organized within practicing communities. Across two studies with 176,728 participants from 15 African and Asian countries, we examine whether mothers' attitudes toward FGC predict daughters' circumcision status and whether this relationship varies with regional FGC prevalence. Multilevel logistic regression models show that maternal attitudes strongly predict daughter circumcision status across both datasets. Contrary to expectations derived from disempowerment frameworks, the association between maternal attitudes and daughter outcomes is not weaker in high-prevalence contexts; it is stronger. These findings suggest that interpretations of FGC as reflecting female disempowerment may mischaracterize the social dynamics of societies in which FGC is common. Policy implications of the findings are discussed.
Luisto, R.; Snell, K.; Vartiainen, V.; Sanmark, E.; Äyrämö, S.
In this study, we investigate gender bias in a Retrieval-Augmented Generation (RAG) based AI assistant developed for Finnish wellbeing services counties. We tested the system using 36 clinically relevant queries, each rendered in three gendered variants (male, female, gender-neutral), and evaluated responses using both an LLM-as-a-judge approach and a human expert panel consisting of a physician and a sociologist specializing in ethics. We observed substantial and clinically significant differences across gendered variants, including differential treatment urgency, inappropriate symptom associations, and misidentification of clinical context. Female variants disproportionately framed responses around childcare and reproductive health regardless of clinical relevance, reflecting societal stereotypes rather than medical reasoning. Bias manifested both at the LLM generation stage and the RAG retrieval stage, in several cases causing the model to hallucinate responses entirely. Some bias patterns were persistent across repeated runs, while others appeared inconsistently, highlighting the challenge of distinguishing systematic bias from stochastic variation.
Bhansali, R.; Gorenshtein, A.; Westover, B.; Goldenholz, D. M.
Manuscript preparation is a critical bottleneck in scientific publishing, yet existing AI writing tools require cloud transmission of sensitive content, creating data-confidentiality barriers for clinical researchers. We introduce the Paper Analysis Tool (PAT), a free, multi-agent framework that deploys 31 specialized agents powered by small language models (SLMs) to audit manuscripts across multiple quality dimensions without external data transmission. Applied to three published clinical neurological papers, PAT generated 540 evaluable suggestions. Validation by two expert reviewers (R.B., A.G.) confirmed 391 actionable, high-value revisions (90% agreement), achieving a 72.4% overall usefulness accuracy spanning methodological, statistical, and visual domains. Furthermore, deterministic re-evaluation of 126 agent-suggested rewrite pairs using Phase 0 metrics confirmed text improvement: total word count decreased by 25%, passive voice prevalence dropped sharply from 35% to 5%, average sentence length decreased by 24%, long-sentence fraction fell by 67%, and the Flesch-Kincaid grade improved by 17%. Our validation confirms that systematic, agent-driven pre-submission review drives measurable improvements, successfully converting manuscript optimization from an opaque, manual endeavor into a transparent and rigorous scientific process.
Dai, H.-J.; Mir, T. H.; Fang, L.-C.; Chen, C.-T.; Feng, H.-H.; Lai, J.-R.; Hsu, H.-C.; Nandy, P.; Panchal, O.; Liao, W.-H.; Tien, Y.-Z.; Chen, P.-Z.; Lin, Y.-R.; Jonnagaddala, J.
Accurate recognition and deidentification of sensitive health information (SHI) in spoken dialogues requires multimodal algorithms that can understand medical language and contextual nuance. However, the recognition and deidentification process itself risks exposing SHI. Additionally, the variability and complexity of medical terminology, along with the inherent biases in medical datasets, further complicate this task. This study introduces the SREDH/AI-Cup 2025 Medical Speech Sensitive Information Recognition Challenge, which focuses on two tasks: Task 1, speech transcription, in which systems must accurately transcribe speech into text; and Task 2, medical speech de-identification, in which systems must detect and appropriately classify mentions of SHI. The competition attracted 246 teams; top-performing systems achieved a mixed error rate (MER) of 0.1147 and a macro F1-score of 0.7103, with average MER and macro F1-score of 0.3539 and 0.2696, respectively. Results were presented at the IW-DMRN workshop in 2025. Notably, the results reveal that LLMs were prevalent across both tasks: 97.5% of teams adopted LLMs for Task 1 and 100% for Task 2, highlighting their growing role in healthcare. Furthermore, we fine-tuned six models, demonstrating strong precision (approximately 0.885-0.889) with slightly lower recall (approximately 0.830-0.847), resulting in F1-scores of 0.857-0.867.
Smah, M. L.; Seale, A.; Rock, K.
Infectious disease dynamics are strongly shaped by human mobility, social structure, and heterogeneous contact patterns, yet many epidemic models do not jointly capture these features. This study develops a spatial metapopulation epidemic model incorporating recurrent group-switch interactions to represent real-world transmission processes. Building on the Movement-Interaction-Return framework, the model integrates household structure, age-stratified contacts, and mobility between locations within a single SEIR framework. Using UK demographic, mobility, and social contact data, the model quantifies how within- and between-group interactions, mobility rates, and location connectivity influence epidemic spread. Both deterministic and stochastic simulations are implemented to analyse outbreak dynamics, variability, and fade-out probabilities for COVID-19-like and Ebola-like infections. Results show that highly connected locations drive faster transmission, earlier epidemic peaks, and greater difficulty in containment, whereas larger but less connected locations tend to produce slower, more localised outbreaks despite their population size. Comparative analysis reveals that COVID-19-like infections spread rapidly and remain difficult to control even under interventions, while Ebola-like infections exhibit slower dynamics and are more effectively contained, particularly under targeted measures. Non-pharmaceutical interventions, particularly widespread closures, substantially reduce infections, hospitalisations, and deaths, although effectiveness depends on timing and pathogen characteristics. These findings highlight the importance of integrating mobility, clustering, and demographic heterogeneity to inform targeted and effective epidemic control strategies.
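For readers unfamiliar with the SEIR structure this abstract builds on, the following is a minimal sketch of a deterministic, single-population SEIR model integrated with forward-Euler steps. It is not the authors' metapopulation model: it omits households, age structure, and mobility, and the function name `simulate_seir` and all parameter values are hypothetical, chosen only for illustration.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, r0, days, dt=0.1):
    """Forward-Euler integration of a single-population SEIR model.

    beta  : transmission rate per day (hypothetical value supplied by caller)
    sigma : rate of leaving the exposed class (1 / latent period)
    gamma : recovery rate (1 / infectious period)
    Returns a list of (time, S, E, I, R) tuples.
    """
    n = s0 + e0 + i0 + r0  # total population, constant in this sketch
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    history = [(0.0, s, e, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        # Flows between compartments over one time step
        new_exposed = beta * s * i / n * dt
        new_infectious = sigma * e * dt
        new_recovered = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((k * dt, s, e, i, r))
    return history


if __name__ == "__main__":
    # Illustrative parameters only; not taken from the study.
    run = simulate_seir(beta=0.5, sigma=0.2, gamma=0.1,
                        s0=9990, e0=0, i0=10, r0=0, days=100)
    peak_time, peak_infectious = max(((t, i) for t, _, _, i, _ in run),
                                     key=lambda pair: pair[1])
    print(f"peak infectious: {peak_infectious:.0f} at day {peak_time:.1f}")
```

A metapopulation version would keep one set of compartments per location and add movement terms between them; the stochastic simulations mentioned in the abstract would replace the deterministic flows with random draws.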
Tampubolon, G.
Population ageing increases the importance of cognitive capacity for making decisions about retirement and living independently beyond it. We tested whether post-war educational expansion and working-life social mobility eliminate the association between social class of origin and cognition in early old age using the 1958 National Child Development Study. Two outcomes were analysed at age 62: standard episodic memory (immediate + delayed word recall) and long-term episodic memory, capturing accurate half-century recall of childhood household facts (rooms and people at age 11 validated against mothers' responses). Social mobility trajectories derived in prior work were classified into predominantly manual versus non-manual class trajectories. Models were estimated separately for women and men across three specifications: (i) social origin and controls, (ii) adding social mobility, and (iii) adding weighting to address healthy survivor bias. Education was consistently associated with both outcomes. For long-term episodic memory, social origin gradients were clearer than for short-term episodic memory, with men from service/professional origins showing a 13 percentage-point higher probability of accurate half-century recall than men from manual origins. These findings indicate that education expansion and working-life social mobility failed to release the grip of social origin on long-term episodic memory.
Challier, V.; Diebo, B.; Lafage, V.; Dehouche, N.; Lonjon, G.; Cristini, J.; SpineDAO
Study Design: Prospective observational study using a novel digital ledger technology (DLT)-based crowdsourcing platform. Objective: To develop and evaluate Spine Reviews, a blockchain-based platform for aggregating spine treatment recommendations from an international specialist panel, and to validate the clinical coherence of the resulting dataset. Summary of Background Data: Predictive models for low back pain treatment are limited by small, homogeneous datasets that fail to capture inter-clinician variability. Traditional multi-center data collection is expensive, slow, and geographically constrained. DLT-based crowdsourcing with cryptographic credentialing may overcome these barriers. Methods: Five hundred synthetic patient vignettes (digital twins) were generated; 463 retained after quality control. A review platform was built on the Solana blockchain using non-transferable Soulbound Tokens (SBTs) for credentialing and smart-contract compensation. Fifty-two specialists from 7 countries provided 4+ reviews per vignette across four treatment tiers, without access to imaging or physical examination. Mixed-effects regression with reviewer random intercepts partitioned decision variability. Results: The platform collected 2,066 completed reviews (97.7%) over 37 days at USD 0.97/review. Variance decomposition revealed that 36.7% of treatment tier variability was attributable to patient presentation, 19.2% to reviewer practice style, and 44.1% to their interaction. Neurological deficits (beta=0.39), symptom duration (beta=0.12), and pain (beta=0.09) independently predicted treatment escalation (all p<0.001). Gwet's AC1 was almost perfect for emergency (0.92) and substantial for conservative decisions (0.67). Reviewer confidence in treatment recommendations decreased with escalating tier severity (conservative 4.59/5 vs surgical 4.05/5), suggesting appropriate uncertainty calibration. 
Conclusions: DLT with SBT credentialing enables rapid, global, cost-effective aggregation of clinically coherent expert judgment. The three-component variance structure quantifies clinical equipoise in spine care and establishes that predictive models require diverse, multi-reviewer training data. Keywords: digital ledger technology; blockchain; crowdsourcing; clinical decision-making; low back pain; Soulbound Tokens
Vaportzis, E.; Edwards, W.
This study investigated retirement adjustment in retired police officers in the UK (N = 289), examining how time since leaving the service moderates the relationship between perceived organisational support and retirement adjustment while accounting for resilience. Results indicated a developmental trend: organisational support remains stable initially but becomes increasingly influential in later life. Using Johnson-Neyman analysis, a threshold of 32.07 years was identified, after which the association reaches statistical significance. These findings suggest an organisational legacy effect; for the older generation, the retrospective perception of being valued by the service acts as a durable psychological resource. This study offers a novel conceptualisation of long-term organisational influence by identifying a temporally delayed legacy effect that extends beyond existing models of retirement adjustment. The study advocates lifelong wellbeing strategies, recognising that the organisational relationship continues to shape adjustment outcomes decades after the conclusion of active duty.
Challier, V.; Jacquemin, C.; Diebo, B.; Dehouche, N.; Denisov, A.; Cristini, J.; Campana, M.; Castelain, J.-E.; Lonjon, G.; Lafage, V.; Ghailane, S.; SpineDAO Collaborative Group
Background: Synthetic data have emerged as a complementary strategy for secondary use of clinical registries, enabling data sharing without patient-level exposure. In spine surgery, multicenter data sharing is constrained by institutional governance and patient privacy regulations. Validated synthetic data generation may enable broader access to surgical outcomes data for artificial intelligence development without compromising patient confidentiality. Objective: To describe and benchmark a three-domain validated synthetic data pipeline applied to a multicenter, tokenized spine surgery registry (SpineBase), and to establish a reproducible certification framework for synthetic spine surgery datasets. Methods: We extracted 125 sacroiliac joint fusion cases from the SpineBase registry (SIBONE study, IRB-SOFCOT approval Ref. 14-2025; CNIL MR-004 Ref. 2234503 v 0). A GaussianCopula generative model was trained on 52 structured variables spanning demographics, preoperative assessments, operative details, and longitudinal outcomes at 3, 6, 12, and 24 months. Synthetic datasets of 100, 1,000, and 10,000 patients were generated. Validation followed a three-domain framework: (1) fidelity, assessed by Kolmogorov-Smirnov tests and Jensen-Shannon divergence; (2) utility, assessed by train-on-synthetic, test-on-real (TSTR) methodology; and (3) privacy, assessed by nearest-neighbor distance ratio (NNDR), membership inference attack, and k-anonymity proxy. Results: All three validation gates passed. Fidelity: mean KS p-value 0.52 (threshold >0.05). Privacy: NNDR >1.0 in 98.9% of synthetic records; membership inference AUROC 0.57. Utility: 12-month Oswestry Disability Index prediction yielded Pearson r = 0.29, consistent with expected attenuation at N = 125. A SHA-256 cryptographic hash of each certified dataset was anchored on the Solana blockchain for immutable provenance.
Conclusions: A validated, blockchain-anchored synthetic data pipeline for spine surgery registries is technically feasible and meets current publication-standard criteria for fidelity and privacy. Utility metrics scale with registry size, creating a direct incentive for multicenter data contribution. This framework provides a reproducible methodology for synthetic data certification in spine surgery research, and establishes certified synthetic datasets as a privacy-native substrate for expert-annotation pipelines, as demonstrated in the companion Spine Reviews study.
Yang, Z.; Lyng, G. D.; Batra, S. S.; Tillman, R. E.
Medical concept extraction from electronic health records underpins many downstream applications, yet remains challenging because medically meaningful concepts, such as diagnoses, are frequently implied rather than explicitly stated in medical narratives. Existing benchmarks with human-annotated evidence spans underscore the importance of grounding extracted concepts in medical text. However, they predominantly focus on explicitly stated concepts and provide limited coverage of cases in which medically relevant concepts must be inferred. We present MedicalBench, a new benchmark for medical concept extraction with evidence grounding that evaluates implicit medical reasoning. MedicalBench formulates medical concept extraction as a verification task over medical note-concept pairs, coupled with sentence-level evidence identification. Built from MIMIC-IV discharge summaries and human-verified ICD-10 codes, the dataset is curated through a multi-stage large language model (LLM) triage pipeline followed by medical annotation and expert review. It deliberately includes implicit positives, semantically confusable negatives, and cases where LLM judgments disagree with medical expert assessments. Annotators provide sentence-level evidence spans and concise medical rationales. The final dataset contains 823 high-quality examples. We define two complementary evaluation tasks: (1) medical concept extraction and (2) sentence-level evidence retrieval, enabling assessment of both correctness and interpretability. Benchmarking state-of-the-art LLMs and a supervised baseline reveals that performance remains modest, highlighting the difficulty of extracting implicitly expressed concepts.
We further show that explicitly incorporating reasoning cues and prompting to extract implicit evidence substantially improves medical concept extractions, while performance is largely invariant to note length, indicating that MedicalBench isolates reasoning difficulty rather than superficial confounders. MedicalBench provides the first systematic benchmark for implicit, evidence-grounded medical concept extraction, offering a foundation for developing medical language models that can both identify medically relevant concepts and justify their predictions in a transparent and medically faithful manner.
Gartlehner, G.; Banda, S.; Callaghan, M.; Chase, J.-A.; Dobrescu, A.; Eisele-Metzger, A.; Flemyng, E.; Gardner, S.; Griebler, U.; Helfer, B.; Jemiolo, P.; Macura, B.; Minx, J. C.; Noel-Storr, A.; Rajabzadeh Tahmasebi, N.; Sharifan, A.; Meerpohl, J.; Thomas, J.
Background: Artificial intelligence (AI) has the potential to improve the efficiency of evidence synthesis and reduce human error. However, robust methods for evaluating rapidly evolving AI tools within the practical workflows of evidence synthesis remain underdeveloped. This protocol describes a study design for assessing the effectiveness, efficiency, and usability of AI tools in comparison to traditional human-only workflows in the context of Cochrane systematic reviews. Methods: Members of the Cochrane Evaluation of (Semi-) Automated Review (CESAR) Methods Project developed an adaptive platform study-within-a-review (SWAR) design, modeled after clinical platform trials. This design employs a master protocol to concurrently evaluate multiple AI tools (interventions) against a standard human-only process (control) across three key review tasks: title and abstract screening, full-text screening, and data extraction. The adaptive framework allows for the addition or removal of AI tools based on interim performance analyses without necessitating a restart of the study. Performance will be assessed using metrics such as accuracy (sensitivity, specificity, precision), efficiency (time on task), response stability, impact of errors, and usability, in alignment with Responsible use of AI in evidence SynthEsis (RAISE) principles. Results: The study will generate comparative data about the performance and usability of specific AI tools employed in a semi- or fully automated manner relative to standard human effort. The protocol provides a flexible framework for the assessment of AI tools in evidence synthesis, addressing the limitations of static, one-time evaluations. Discussion: This study protocol presents a novel methodological approach to addressing the challenges of evaluating AI tools for evidence syntheses. 
By validating entire workflows rather than individual technologies, the findings will establish an evidence base for determining the viability of integrating AI into evidence-synthesis workflows. The adaptive design of this study is flexible and can be adopted by other investigators, ensuring that the evaluation framework remains relevant as new tools emerge.
Khanna, S.; Chaudhary, R.; Narula, N.; Lee, R.
Lung cancer screening saves lives, yet uptake remains suboptimal and inequitable. Personalised communication can improve attendance and reduce anxiety, but scaling such support is a workforce challenge. We fine-tuned Google's Gemma 2 9B using QLoRA on 5,086 synthetic screening conversations and compared it against Google's Gemini 2.5 Flash (a larger frontier model) and an unmodified baseline across 300 multi-turn conversations with 100 patient personas spanning ten clinical categories. Evaluation combined automated natural language processing metrics with independent language model judgement in two complementary modes: structured clinical rubric and simulated patient persona. The fine-tuned model achieved the highest simulated patient experience score (3.71/5 vs 3.65 for the frontier model), recorded zero boundary violations after clinician review of all flagged instances, and led on the four most safety-critical categories. A composite Patient Adaptation Index showed that the fine-tuned model led overall (0.37 vs 0.35 vs 0.35), with its clearest advantage on the two clinically specific components: empathy calibration to patient distress and selective smoking cessation signposting. These findings suggest that targeted fine-tuning of open-source models can yield clinical communication quality comparable to larger proprietary systems, with advantages in safety-critical scenarios and suitability for NHS data governance constraints. Human clinician review of these conversations is ongoing.
Lin, R.; Halfwerk, F. R.; Donker, D. W.; Tertoolen, J.; van der Pas, V. R.; Laverman, G. D.; Wang, Y.
Objective: Skin sympathetic nerve activity (SKNA) has emerged as a promising non-invasive surrogate measure of sympathetic drive, but its relevant physiological characteristics remain ill-defined. This observational study investigates its regulatory patterns during rest and the Valsalva maneuver (VM) in healthy participants. Method: Using a two-layer strategy integrating signal analysis and physiological modelling, we analyzed data recorded from 41 subjects performing repeated VMs. The observational layer includes time-domain feature comparisons using linear mixed-effect models and time-varying spectral coherence analysis. The mechanistic layer proposes a mathematical model to investigate whether baroreflex and respiratory modulation are sufficient to reproduce the observed HR and average SKNA (aSKNA) dynamics. Main Results: Mean integrated SKNA (iSKNA) showed more significant VM-induced change than HRV. We also found that the increase in mean iSKNA during VM varies with BMI and sex. The coherence analysis indicated that iSKNA strongly synchronized with EDR under resting conditions. The proposed model successfully reproduced the main characteristics of aSKNA dynamics, yielding a high median Pearson correlation coefficient (PCC) of 0.80 ([Q1, Q3] = [0.60, 0.91]). In contrast, HR dynamics were only partially captured, with a median PCC of 0.37 ([Q1, Q3] = [0.16, 0.55]). These results suggest that SKNA provides a more direct representation of sympathetic burst dynamics during VM in healthy subjects. Significance: This study provides convergent evidence that SKNA reflects known autonomic regulatory influences in healthy subjects. These findings strengthen the physiological interpretability of SKNA while clarifying its appropriate use as a practical biomarker of sympathetic function.
Fotso, J. C.; Togo, E.; Bidashimwa, D.; Adje, O. E.; Moumouni, N. A.
Family planning (FP) self-care is a strategic pillar for advancing Universal Health Coverage (UHC) and mitigating health workforce shortages. However, a significant disconnect persists between global normative frameworks and local implementation realities. This study examines the local meanings, perceptions, and experiences of FP self-care in Niger to inform contextualized scale-up of self-care interventions. We employed a sequential mixed-methods design in the Niamey (urban) and Zinder (rural) regions of Niger. A quantitative household survey was conducted with 510 women and 357 men to assess fertility awareness, method preferences, and information-seeking behaviors. This was complemented by qualitative in-depth interviews with 36 women, 18 men, 12 healthcare providers, and 15 community leaders. Quantitative data were analyzed using descriptive statistics, while qualitative transcripts underwent iterative thematic analysis mapped to global self-care frameworks. "Self-care" was locally reconstructed not as autonomy. While defined by all participants as hygiene, it was uniquely reconstructed by men and community leaders as economic provision. A distinct "medicalization paradox" emerged: women defined self-care as the agency to seek clinical dependence, prioritizing facility-based providers over community sources (e.g., 58.1% vs. 12.1% for oral contraceptives) to mitigate fears regarding product quality and side effects. Conversely, men favored Community Health Workers (34.3%) driven by logistical efficiency and economic motivations. Physiological knowledge was low; only 11.8% of women correctly identified the fertile window, with misconceptions reinforced by fatalistic narratives propagated by community gatekeepers. Furthermore, providers expressed strong skepticism regarding user competence, fearing "chaos" without medical supervision. Implementing FP self-care in Niger requires shifting from a "product-first" to a "values-first" approach. 
Strategies must be gender-stratified: leveraging "medicalized validation" to address women's safety concerns while utilizing community-based channels to meet men's efficiency needs. Ultimately, self-care should be framed not as independence from the health system, but as an empowered partnership with it.
Bauman, A.; Owen, K.; Messing, S.; Macdonald, H.; Nettlefold, L.; Richards, J.; Vandelanotte, C.; Chen, I.-H.; Cullen, B.; van Buskirk, J.; van Itallie, A.; Coletta, G.; O'Halloran, P.; Randle, E.; Nicholson, M.; Staley, K.; McKay, H. A.
Military aviation training noise remains understudied despite its widespread impacts across urban, rural, and wilderness areas. The predominance of low-frequency noise and repetitive training can create pervasive noise pollution, yet past research often fails to capture the full range of health and quality-of-life effects. This study analyzed two complaint datasets related to Whidbey Island Naval Air Station noise: U.S. Navy records (2017-2020) and Quiet Skies Over San Juan County data (2021-2023). We analyzed and mapped sentiment intensity from noise complaints relative to modeled annual noise exposure, developed a typology to classify impacts, and modeled the environmental and operational factors influencing complaints. Findings revealed widespread negative sentiment and anger, often beyond the bounds of estimated noise contours, suggesting that annual cumulative noise models inadequately estimate community impacts. Complaints consistently highlighted sleep disturbance, hearing and health concerns, and compromised home environments due to shaking, vibration, and disruption of daily life. Residents also reported significant social, recreational, and work disruptions, along with feelings of fear, helplessness, and concern for children's well-being. The number of complaints was strongly associated with training schedules, with late-night sessions being the strongest predictor. A delayed response pattern suggests residents reach a frustration threshold before filing complaints. Overall, our findings demonstrate persistent negative sentiment and diverse impacts from military aviation noise. Results highlight the need for improved noise metrics, modeling, and operational adjustments to mitigate the most disruptive effects.
Shaetonhodi, N. G.; De Vos, L.; Babalola, C.; de Voux, A.; Joseph Davey, D.; Mdingi, M.; Peters, R. P. H.; Klausner, J. D.; Medina-Marino, A.
BackgroundCurable sexually transmitted infections (STIs), including Chlamydia trachomatis, Neisseria gonorrhoeae, and Trichomonas vaginalis, remain highly prevalent among pregnant women in South Africa. Despite poor diagnostic performance in pregnancy, syndromic management remains standard care. Point-of-care (POC) screening enables aetiological diagnosis and same-visit treatment but is not yet included in national guidelines. We conducted a mixed-methods process evaluation to examine determinants of antenatal POC STI screening implementation in public facilities. MethodsThis evaluation was embedded within the three-arm Philani Ndiphile randomized trial (March 2021-February 2025) across four public clinics in the Eastern Cape. Screening used a near-POC, electricity-dependent nucleic acid amplification test with a 90-minute turnaround time. Reach, Adoption, Implementation, and Maintenance were assessed using the RE-AIM framework. Quantitative indicators included uptake of screening, treatment, and follow-up attendance. Qualitative data included in-depth interviews with 20 pregnant women and five focus group discussions with 21 research staff and government healthcare workers. The Consolidated Framework for Implementation Research guided qualitative analysis. Findings were integrated using narrative weaving. ResultsScreening uptake was high (99.0%), with treatment coverage of 95.2% at baseline and 93.5% at repeat screening. Same-day treatment was lower (50.7% and 69.8%) and varied substantially by facility, reflecting operational constraints including turnaround time, patient volume, infrastructure, and electricity. Attendance was higher when screening was integrated into routine ANC. Women valued screening for infant health, while providers recognised advantages over syndromic management but highlighted workforce, resource, and maintenance constraints. Socioeconomic factors, including transport costs, hunger, and work commitments, influenced retention and waiting. 
Conclusions: Antenatal POC STI screening was acceptable and achieved high treatment coverage in a research setting. However, same-day treatment was constrained by operational requirements of the testing platform. Scale-up will require workflow integration, strengthened health system capacity, and faster diagnostics suited to routine antenatal care. Key Messages. What is already known on this topic: Syndromic management remains standard antenatal care in many low-resource settings despite failing to capture up to 89% of infections that remain asymptomatic. Point-of-care aetiological screening has demonstrated feasibility, acceptability, and potential clinical benefit in research settings, yet has not been widely adopted into national policy. Limited evidence exists on the health system requirements and contextual determinants influencing scale-up within routine public facilities. What this study adds: This mixed-methods process evaluation demonstrates high uptake and treatment coverage of antenatal POC STI screening in a trial setting, while identifying facility-level, structural, and socioeconomic factors shaping same-day treatment and retention. We show that implementation success varies substantially across clinics and depends on assay characteristics, workflow integration, human resources, infrastructure reliability, and follow-up capacity. How this study might affect research, practice or policy: These findings provide implementation-relevant evidence to inform national policy deliberations on integrating POC STI screening into antenatal care. Sustainable scale-up will require context-adapted delivery models, strengthened workforce and supply systems, faster diagnostics, and alignment with existing ANC workflows to ensure equitable and durable impact.
Asplin, P.; Mancy, R.; Keeling, M. J.; Hill, E. M.
Symptom propagation occurs when the symptoms of secondary cases are related to those of the primary case as a result of epidemiological mechanisms. Determining whether, and to what extent, symptom propagation occurs requires data-driven methods. Here we quantify the strength of symptom propagation as the increase in risk of a secondary case developing severe symptoms if the primary case has severe symptoms. We first used synthetic results to determine the data requirements to robustly estimate the strength of symptom propagation and to investigate the effect of severity-dependent reporting bias. Categorising symptom severity into two groups (mild or severe; asymptomatic or symptomatic), our estimation requires only four summary statistics: the number of primary-secondary case pairs of each combination of symptom presentations. Our analysis showed that a relatively small number (100) of synthetic primary-secondary case pairs was sufficient to obtain a reasonable estimate of the strength of symptom propagation, and 1,000 pairs meant errors were consistently small across replicates. Our estimates were robust to severity-dependent reporting bias. We also explored how symptom propagation can be separated from other individual-level factors affecting severity, using age dependence as an example. Although synthetic data generated from an age-structured model led to overestimations of the strength of symptom propagation, allowing disease severity to be age-dependent restored the accuracy of parameter estimation. Finally, we applied our methodology to estimate the strength of symptom propagation from three publicly available data sets collected during the COVID-19 pandemic with data on presence or absence of symptoms: England households, Israel households, and Norway contact tracing. Our age-free methodology indicated a 12-17% increase in the risk of being symptomatic if infected by someone symptomatic.
Our positive estimates for the strength of symptom propagation persisted when applying our age-dependent methodology to the two household data sets with age-structured information (England and Israel). These findings demonstrate evidence for symptom propagation of SARS-CoV-2 and provide consistent estimates for its strength. Our synthetic data analysis supports the conclusion that these correlations are not a result of reporting bias or age-dependent effects. This work provides a practical tool for estimating the strength of symptom propagation that has minimal data requirements, enabling application across a wide range of pathogens and epidemiological settings.
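As an illustration of how little data the four summary statistics involve, the sketch below computes a naive absolute and relative increase in risk from the four primary-secondary pair counts. It is not the authors' estimator, which the abstract does not specify; the `PairCounts` type, the function names, and the example counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairCounts:
    """Primary-secondary case pairs, counted by symptom presentation."""
    severe_to_severe: int  # primary severe, secondary severe
    severe_to_mild: int    # primary severe, secondary mild
    mild_to_severe: int    # primary mild, secondary severe
    mild_to_mild: int      # primary mild, secondary mild


def secondary_severe_risk(severe: int, mild: int) -> float:
    """Proportion of secondary cases that are severe within one primary group."""
    total = severe + mild
    if total == 0:
        raise ValueError("no pairs observed for this primary presentation")
    return severe / total


def risk_increase(counts: PairCounts) -> tuple[float, float]:
    """Return (absolute, relative) increase in the risk of a severe secondary
    case when the primary case is severe rather than mild.

    This is a plain 2x2-table comparison; it makes no correction for
    reporting bias or age, both of which the abstract says matter.
    """
    risk_given_severe = secondary_severe_risk(
        counts.severe_to_severe, counts.severe_to_mild
    )
    risk_given_mild = secondary_severe_risk(
        counts.mild_to_severe, counts.mild_to_mild
    )
    if risk_given_mild == 0:
        raise ValueError("relative increase undefined when baseline risk is zero")
    absolute = risk_given_severe - risk_given_mild
    relative = risk_given_severe / risk_given_mild - 1.0
    return absolute, relative


# Made-up counts for illustration only; not taken from any data set.
example = PairCounts(
    severe_to_severe=40, severe_to_mild=40, mild_to_severe=20, mild_to_mild=60
)
absolute, relative = risk_increase(example)
print(f"absolute increase: {absolute:.2f}, relative increase: {relative:.0%}")
```

The same function applies to the asymptomatic/symptomatic split by relabelling "severe" as symptomatic and "mild" as asymptomatic.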
Blythe, R.; Senanayake, S.; Bylstra, Y.; Roberts, J.; Choi, C.; Yeo, M. J.; Goh, J.; Graves, N.; Koh, A. L.; Jamuar, S. S.
Background: Carrier screening for inherited genetic disorders can reduce the burden of conditions that lead to childhood morbidity and mortality, including thalassaemia, cystic fibrosis, and spinal muscular atrophy. To be successful, national carrier screening programs should aim to maximise uptake, which may depend on population preferences for screening characteristics. In this study, we aimed to determine how expanded carrier screening in Singapore should be designed based on operational factors including suggested copayments, wait times, and disorders included in screening panels. Methods: We elicited stated preferences for the design of a hypothetical national carrier screening program with seven attributes from 500 Singaporeans of reproductive age (18 to 54). A discrete choice experiment was applied using 30 choice tasks with 3 alternatives per task, divided between 3 blocks. The mixed multinomial logit model was used to estimate willingness-to-pay for each attribute level. Predicted uptake for three plausible screening programs was assessed, with copayment amounts from $0 to $1,200 in increments of $30. Impact on the annual national budget was calculated as a function of 25,000 expected eligible couples per year. All costs were reported in 2026 SGD. Results: Respondents showed the strongest preferences for cost, followed by the number of diseases included in the panel, then wait times, with limited impact of remaining attributes. With no copayments, predicted uptake ranged from 85% [95% CI: 83% to 87%] to 90% [88% to 92%] for the basic and utility-maximising screening programs, respectively. This declined to 61% [56% to 66%] and 69% [65% to 73%], respectively, at a copayment of $1,200 per test. The model predicted higher uptake if a selection of screening alternatives was available, compared to a single program.
The budget impact was highly dependent on population eligibility, copayments, and couples' decision-making processes, but was unlikely to exceed $22.5m [$19.0m to $26.6m] per year unless expanded beyond married couples. Conclusions: There was high predicted demand for carrier screening even as copayments increased. Successful strategies to improve uptake may include reducing copays and wait times, increasing the number of screening options available to prospective parents, and increasing program eligibility beyond pre-conception married couples.
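The dependence of budget impact on eligibility, uptake, and copayment can be sketched with simple arithmetic. The abstract does not give the authors' budget model or the gross cost of a test, so the function below and its `gross_cost_per_couple` parameter are assumptions for illustration; only the 25,000 eligible couples and the 90% uptake at zero copayment come from the abstract.

```python
def annual_budget_impact(
    eligible_couples: int,
    uptake: float,
    gross_cost_per_couple: float,
    copayment: float,
) -> float:
    """Rough annual cost to the payer: couples screened times net cost each.

    Assumes one test per couple and that the payer covers whatever the
    copayment does not; both are simplifying assumptions, not the
    authors' stated method.
    """
    if not 0.0 <= uptake <= 1.0:
        raise ValueError("uptake must be a proportion between 0 and 1")
    if gross_cost_per_couple < 0 or copayment < 0:
        raise ValueError("costs must be non-negative")
    net_cost = max(gross_cost_per_couple - copayment, 0.0)
    return eligible_couples * uptake * net_cost


# 25,000 couples and 90% uptake at zero copayment are from the abstract.
# The gross cost of SGD 500 is a placeholder, NOT a figure from the study.
HYPOTHETICAL_GROSS_COST_SGD = 500.0
impact = annual_budget_impact(
    eligible_couples=25_000,
    uptake=0.90,
    gross_cost_per_couple=HYPOTHETICAL_GROSS_COST_SGD,
    copayment=0.0,
)
print(f"illustrative annual budget impact: SGD {impact:,.0f}")
```

In a fuller model, uptake would itself fall as the copayment rises, as the predicted uptake figures in the abstract indicate, so the two inputs should not be varied independently.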
Kim, J.; Lee, S.; Nam, K.
A central question in the psycholinguistics of visual word recognition is whether morphologically complex words are obligatorily decomposed into stems and affixes or whether whole-word access can occur when forms are frequent and familiar. The present study investigated how morphological complexity and lexical frequency jointly shape neural responses by leveraging Korean nominal inflection, whose transparent stem-suffix structure permits a clean dissociation between base (stem) frequency and surface (whole-word) frequency. Twenty-five native Korean speakers completed a rapid event-related fMRI lexical decision task involving simple and inflected nouns that varied parametrically in both frequency measures. Representational similarity analysis (RSA) revealed robust encoding of surface frequency (but not base frequency) in the inferior frontal gyrus (IFG) pars opercularis and supramarginal gyrus (SMG), with significantly stronger correlations for inflected than simple nouns. Univariate analyses converged with this result: surface frequency selectively increased activation for inflected nouns in inferior parietal regions, whereas base frequency showed no reliable effects in any ROI. These findings challenge models positing obligatory pre-lexical decomposition, instead supporting accounts in which morphological processing is shaped by post-lexical, usage-driven lexical statistics. Taken together, our findings support a distributed perspective on morphological processing, suggesting that structural and statistical factors jointly constrain access to morphologically complex forms.